Abstract

We relate the Burrows-Wheeler transformation to a result in combinatorics
on words known as the Gessel-Reutenauer transformation.

1 Introduction 

The Burrows-Wheeler transformation is a popular method used for text
compression. The rough idea is to encode a text in two passes. In the first
pass, the text $w$ is replaced by a text $T(w)$ of the same length obtained as
follows: list the cyclic shifts of $w$ in alphabetic order as the rows
$w_1, w_2, \ldots, w_n$ of an array. Then $T(w)$ is the last column of the
array. In a second pass, $T(w)$ is compressed using a simple method such as
run-length or move-to-front encoding. Indeed, adjacent rows will often begin
with a long common prefix and $T(w)$ will therefore have long runs of
identical symbols. For example, in a text in English, most rows beginning
with 'nd' will end with 'a'. We refer to [?] for a complete presentation of
the algorithm and an analysis of its performance. It was remarked recently by
S. Mantaci, A. Restivo and M. Sciortino [?] that this transformation is
related to notions in combinatorics on words such as Sturmian words. Similar
considerations were developed in [?] in a different context. The results
presented here are also close to those of [?].

In this note, we study the transformation from the combinatorial point of
view. We show that the Burrows-Wheeler transformation is a particular case of
a bijection due to I.M. Gessel and C. Reutenauer which allows the enumeration
of permutations by descents and cyclic type (see [?]).

The paper is organized as follows. In the first section, we describe the
Burrows-Wheeler transformation. The next section describes the inverse of the
transformation with some emphasis on the computational aspects. The last
section is devoted to the link with the Gessel-Reutenauer correspondence.

2 The Burrows-Wheeler transformation

The principle of the method is very simple. We consider an ordered alphabet A. 
Let $w = a_1 a_2 \cdots a_n$ be a word of length $n$ on the alphabet $A$. The
Parikh vector of a word $w$ on the alphabet $A$ is the integer vector
$v = (n_1, n_2, \ldots, n_k)$ where $n_i$ is the number of occurrences of the
$i$-th letter of $A$ in $w$. We suppose $w$ to be primitive, i.e. that $w$ is
not a power of another word. Let $w_1, w_2, \ldots, w_n$ be the sequence of
conjugates of $w$ in increasing alphabetic order. Let $b_i$ denote the last
letter of $w_i$, for $i = 1, \ldots, n$. Then the Burrows-Wheeler transform of
$w$ is the word $T(w) = b_1 b_2 \cdots b_n$.

Example 1 Let $w = abracadabra$. The list of conjugates of $w$ sorted in
alphabetical order is represented below.
The word $T(w)$ is the last column of the array. Thus $T(w) = rdarcaaaabb$.
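The definition can be checked directly on Example 1. The following is a minimal Python sketch, not an efficient implementation: it builds every conjugate explicitly and sorts them, which takes quadratic space. The function names `conjugates` and `bwt` are illustrative choices, and plain string comparison is assumed to match the alphabetic order on $A$.

```python
def conjugates(w: str) -> list[str]:
    """All cyclic shifts (conjugates) of w, listed by starting position."""
    return [w[i:] + w[:i] for i in range(len(w))]


def bwt(w: str) -> str:
    """Last column of the alphabetically sorted array of conjugates of w.

    Illustration only; w is assumed to be primitive, as in the text.
    """
    rows = sorted(conjugates(w))
    return "".join(row[-1] for row in rows)


if __name__ == "__main__":
    # Print the sorted array of conjugates, then its last column.
    for row in sorted(conjugates("abracadabra")):
        print(row)
    print(bwt("abracadabra"))
```

Calling `bwt` on any conjugate of `abracadabra` returns the same word, in line with the remark that $T(w)$ depends only on the conjugacy class of $w$.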

It is clear that $T(w)$ depends only on the conjugacy class of $w$. Therefore,
in order to study the correspondence $w \mapsto T(w)$, we may suppose that $w$
is a Lyndon word, i.e. that $w = w_1$. Let $c_i$ denote the first letter of
$w_i$. Thus the word $z = c_1 c_2 \cdots c_n$ is the nondecreasing
rearrangement of $w$ (and of $T(w)$).

Let $\sigma$ be the permutation of the set $\{1, \ldots, n\}$ such that
$\sigma(i) = j$ iff $w_j = a_i a_{i+1} \cdots a_{i-1}$. In other terms,
$\sigma(i)$ is the rank in the alphabetic order of the $i$-th circular shift
of the word $w$.

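Example 1 (continued)
Taking for $w$ the Lyndon conjugate $aabracadabr$, we have

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 \\ 1 & 3 & 7 & 11 & 4 & 8 & 5 & 9 & 2 & 6 & 10 \end{pmatrix}$$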

By definition, we have for each index $i$ with $1 \le i \le n$

$$a_i = c_{\sigma(i)}. \qquad (1)$$

We also have the following formula expressing $T(w)$ using $\sigma$:

$$b_i = a_{\sigma^{-1}(i) - 1}. \qquad (2)$$

Indeed, $b_{\sigma(j)}$ is the last letter of
$w_{\sigma(j)} = a_j a_{j+1} \cdots a_{j-1}$, whence $b_{\sigma(j)} = a_{j-1}$,
which is equivalent to the above formula.
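Formulas (1) and (2) can be verified numerically. The sketch below is only an illustration under the assumption that Python string order matches the alphabetic order; the names `sigma` and `check_formulas` are hypothetical helpers. Values are 1-based as in the text and stored in 0-based Python lists, and the index $\sigma^{-1}(i) - 1$ is taken cyclically.

```python
def sigma(w: str) -> list[int]:
    """sigma(w)[i-1] is the rank (1-based) of the i-th cyclic shift of w."""
    n = len(w)
    order = sorted(range(n), key=lambda i: w[i:] + w[:i])
    s = [0] * n
    for rank, i in enumerate(order, start=1):
        s[i] = rank
    return s


def check_formulas(w: str) -> bool:
    """Check a_i = c_sigma(i) and b_i = a_(sigma^-1(i) - 1) for a primitive w."""
    n = len(w)
    s = sigma(w)
    inv = [0] * n
    for i, r in enumerate(s, start=1):
        inv[r - 1] = i                      # inv[r-1] = sigma^{-1}(r)
    z = sorted(w)                           # first column c_1 ... c_n
    y = [row[-1] for row in sorted(w[i:] + w[:i] for i in range(n))]  # last column
    formula1 = all(w[i - 1] == z[s[i - 1] - 1] for i in range(1, n + 1))
    # a_k is w[k-1]; with k = sigma^{-1}(i) - 1 taken cyclically this is w[(inv - 2) % n].
    formula2 = all(y[i - 1] == w[(inv[i - 1] - 2) % n] for i in range(1, n + 1))
    return formula1 and formula2
```

For the Lyndon conjugate `aabracadabr` of Example 1, `sigma` returns the bottom row of the array displayed above.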






Let $\pi = P(w)$ be the permutation defined by
$\pi(i) = \sigma(\sigma^{-1}(i) + 1)$ where the addition is to be taken
mod $n$. Actually, $\pi$ is just the permutation obtained by writing $\sigma$
as a word and interpreting it as an $n$-cycle. Thus, we have also
$\sigma(i) = \pi^{i-1}(1)$ and

$$a_i = c_{\pi^{i-1}(1)}. \qquad (3)$$

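Example 1 (continued)

We have, written as a cycle,

$$\pi = (1\ 3\ 7\ 11\ 4\ 8\ 5\ 9\ 2\ 6\ 10)$$

and as an array

$$\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 \\ 3 & 6 & 7 & 8 & 9 & 10 & 11 & 5 & 2 & 1 & 4 \end{pmatrix}$$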

Substituting in Formula (2) the value of $a_i$ given by Formula (1), we obtain
$b_i = c_{\sigma(\sigma^{-1}(i) - 1)}$, which is equivalent to

$$c_i = b_{\pi(i)}. \qquad (4)$$

Thus the permutation $\pi$ transforms the last column of the array of
conjugates of $w$ into the first one. Actually, it can be noted that $\pi$
transforms any column of this array into the following one.
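The passage from $\sigma$ to $\pi$ and Formula (4) can be illustrated on Example 1. This is a small sketch with hypothetical helper names (`pi_from_sigma`, `check_formula_4`); permutations are stored as 0-based Python lists holding 1-based values.

```python
def pi_from_sigma(s: list[int]) -> list[int]:
    """Read sigma as a word and interpret it as an n-cycle:
    pi(sigma(i)) = sigma(i + 1), with i + 1 taken mod n."""
    n = len(s)
    p = [0] * n
    for i in range(n):
        p[s[i] - 1] = s[(i + 1) % n]
    return p


def check_formula_4(y: str, p: list[int]) -> bool:
    """Check c_i = b_pi(i), where c is the sorted rearrangement of y."""
    z = sorted(y)
    return all(z[i] == y[p[i] - 1] for i in range(len(y)))


# Values of Example 1.
SIGMA = [1, 3, 7, 11, 4, 8, 5, 9, 2, 6, 10]
PI = pi_from_sigma(SIGMA)
```

Here `PI` is the bottom row of the array form of $\pi$, and `check_formula_4("rdarcaaaabb", PI)` confirms that $\pi$ sends the last column to the first one.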

The computation of $T(w)$ from $w$ can be done in linear time. Indeed,
provided $w$ is chosen as a Lyndon word, the order between the conjugates is
the same as the order between the corresponding suffixes. The permutation
$\sigma$ is then obtained from the suffix array of $w$, which can be computed
in linear time [3] on a fixed alphabet. The corresponding result on the
alphabet of integers is more recent. It has been proved independently by
three groups of researchers, [7], [?] and [?].
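The reduction to suffix sorting can be sketched as follows. Note the assumption: the suffix array here is built by a naive comparison sort, which is not linear time; a linear-time construction would have to be substituted to obtain the stated bound. The function names are illustrative.

```python
def suffix_array_naive(w: str) -> list[int]:
    """Starting positions (0-based) of the suffixes of w in sorted order.

    Naive construction, for illustration only (not linear time).
    """
    return sorted(range(len(w)), key=lambda i: w[i:])


def rotation_order(w: str) -> list[int]:
    """Starting positions (0-based) of the conjugates of w in sorted order."""
    return sorted(range(len(w)), key=lambda i: w[i:] + w[:i])


def bwt_from_suffix_array(w: str) -> str:
    """T(w) for a Lyndon word w: the letter preceding each sorted suffix,
    read cyclically (w[-1] precedes the suffix starting at position 0)."""
    return "".join(w[i - 1] for i in suffix_array_naive(w))
```

On the Lyndon word `aabracadabr` the two orders coincide, so the last column can be read off the suffix array without building the conjugates.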



3 Inverse transformation 

We now show how $w$ can be recovered from $T(w)$. For this, we introduce the
following notation. The rank of $i$ in the word $y = b_1 b_2 \cdots b_n$,
denoted $\mathrm{rank}(i, y)$, is the number of occurrences of the letter
$b_i$ in $b_1 b_2 \cdots b_i$.

We observe that for each index $i$, and for the aforementioned words
$y = b_1 b_2 \cdots b_n$ and $z = c_1 c_2 \cdots c_n$,

$$\mathrm{rank}(i, z) = \mathrm{rank}(\pi(i), y). \qquad (5)$$

Indeed, we first note that for two words $u, v$ of the same length and any
letter $a$, one has $au < av \iff ua < va$ ($\iff u < v$). Thus for all
indices $i, j$

$$i < j \text{ and } c_i = c_j \implies \pi(i) < \pi(j). \qquad (6)$$

Hence, the number of occurrences of $c_i$ in $c_1 c_2 \cdots c_i$ is equal to
the number of occurrences of $b_{\pi(i)} = c_i$ in
$b_1 b_2 \cdots b_{\pi(i)}$.
To obtain $w$ from $T(w) = b_1 b_2 \cdots b_n$, we first compute
$z = c_1 c_2 \cdots c_n$ by rearranging the letters $b_i$ in nondecreasing
order. Property (5) shows that $\pi(i)$ is the index $j$ such that
$c_i = b_j$ and $\mathrm{rank}(j, y) = \mathrm{rank}(i, z)$. This defines the
permutation $\pi$, from which $\sigma$ can be reconstructed. An algorithm
computing $\pi$ from $y = T(w)$ is represented below.

Permutation($b_1 b_2 \cdots b_n$)
    $c \leftarrow \mathrm{sort}(b_1 b_2 \cdots b_n)$
    for $i \leftarrow 1$ to $n$ do
        if $i = 1$ or $c_{i-1} \ne c_i$ then
            $j \leftarrow 0$
        do $j \leftarrow j + 1$
        while $b_j \ne c_i$
        $\pi(i) \leftarrow j$
    return $\pi$

This algorithm can be optimized to a linear-time algorithm by storing the first 
position of each symbol in the word z. 
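One possible way to realize this optimization is sketched below, assuming a fixed alphabet so that sorting the distinct letters costs constant time. The dictionary `first` stores the first position of each symbol in $z$, and a single left-to-right scan of $y$ fills in $\pi$. The function name `permutation` and the 1-based output convention are choices made for this illustration.

```python
from collections import Counter


def permutation(y: str) -> list[int]:
    """pi as a list of 1-based values: pi[i-1] is the index j such that
    b_j = c_i and rank(j, y) = rank(i, z), where z is y sorted."""
    counts = Counter(y)
    first = {}                     # first[a] = 0-based first position of a in z
    pos = 0
    for letter in sorted(counts):  # distinct letters only
        first[letter] = pos
        pos += counts[letter]

    pi = [0] * len(y)
    seen = Counter()               # occurrences of each letter met so far in y
    for j, b in enumerate(y, start=1):
        i = first[b] + seen[b]     # 0-based position in z of the matching letter
        pi[i] = j
        seen[b] += 1
    return pi
```

Applied to `rdarcaaaabb`, this returns the array form of $\pi$ from Example 1.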

Finally, $w$ can be recovered from $z = c_1 c_2 \cdots c_n$ and $\pi$ by
Formula (3). The algorithm recovering $w$ is represented below.

Word($z$, $\pi$)
    $j \leftarrow 1$
    $a_1 \leftarrow c_1$
    for $i \leftarrow 2$ to $n$ do
        $j \leftarrow \pi(j)$
        $a_i \leftarrow c_j$
    return $w$

The computation of $w$ is not possible without the Parikh vector or
equivalently the word $z$. One can however always compute the word $w$ on the
smallest possible alphabet associated with the permutation $\pi$ (this is the
computation described in [?]).
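The two steps of the inversion can be combined in a short sketch. It assumes a nonempty input that is the transform of a primitive word, and it obtains $\pi$ through a stable sort of the positions of $y$ by letter, which is a shortcut chosen for brevity rather than the procedure described in the text. The name `inverse_bwt` is hypothetical.

```python
def inverse_bwt(y: str) -> str:
    """Recover the Lyndon word w with T(w) = y, following Formula (3).

    Assumes y is nonempty and is the transform of a primitive word.
    """
    n = len(y)
    z = sorted(y)
    # Python's sort is stable, so equal letters keep their order in y:
    # the i-th entry of the result is pi(i) - 1, by Property (5).
    pi = sorted(range(n), key=lambda j: y[j])
    j = 0
    letters = [z[0]]               # a_1 = c_1
    for _ in range(1, n):
        j = pi[j]                  # j <- pi(j)
        letters.append(z[j])       # a_i = c_j
    return "".join(letters)
```

The result is the Lyndon conjugate of the original text, since $T(w)$ only determines $w$ up to conjugacy.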

4 Descents of permutations 

A descent of a permutation $\pi$ is an index $i$ such that
$\pi(i) > \pi(i + 1)$. We denote by $\mathrm{des}(\pi)$ the set of descents of
the permutation $\pi$. It is clear by Property (6) that if $i$ is a descent of
$P(w)$, then $c_i \ne c_{i+1}$. Thus, the number of descents of $\pi$ is at
most equal to $k - 1$ where $k$ is the number of symbols appearing in the
word $w$.

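Example 1 (continued) The descents of $\pi$ appear in boldface.

$$\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & \mathbf{7} & \mathbf{8} & \mathbf{9} & 10 & 11 \\ 3 & 6 & 7 & 8 & 9 & 10 & 11 & 5 & 2 & 1 & 4 \end{pmatrix}$$

Thus $\mathrm{des}(\pi) = \{7, 8, 9\}$.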
Let us fix an ordered alphabet $A$ with $k$ elements for the rest of the
paper. Let $w$ be a word and $v = (n_1, n_2, \ldots, n_k)$ be the Parikh
vector of $w$. We say that $v$ is positive if $n_i > 0$ for
$i = 1, 2, \ldots, k$. We denote by $p(v)$ the set of integers
$p(v) = \{n_1, n_1 + n_2, \ldots, n_1 + \cdots + n_{k-1}\}$. When $v$ is
positive, $p(v)$ has $k - 1$ elements. Let $\pi = P(w)$ and let $v$ be the
Parikh vector of $w$. It is clear by Formula (6) that we have the inclusion
$\mathrm{des}(\pi) \subset p(v)$.

Example 1 (continued) The Parikh vector of the word $w = abracadabra$ is
$v = (5, 2, 1, 1, 2)$ and $p(v) = \{5, 7, 8, 9\}$.
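The inclusion of the descent set in $p(v)$ can be checked on this example with a few lines of Python. The helper names (`descents`, `parikh_vector`, `partial_sums`) are illustrative, and the alphabet is passed explicitly as a string of letters in increasing order.

```python
from itertools import accumulate


def descents(pi: list[int]) -> set[int]:
    """Indices i (1-based) such that pi(i) > pi(i + 1)."""
    return {i for i in range(1, len(pi)) if pi[i - 1] > pi[i]}


def parikh_vector(w: str, alphabet: str) -> list[int]:
    """Number of occurrences in w of each letter of the ordered alphabet."""
    return [w.count(a) for a in alphabet]


def partial_sums(v: list[int]) -> set[int]:
    """The set p(v) = {n_1, n_1 + n_2, ..., n_1 + ... + n_(k-1)}."""
    return set(list(accumulate(v))[:-1])


# Values of Example 1.
PI = [3, 6, 7, 8, 9, 10, 11, 5, 2, 1, 4]
V = parikh_vector("abracadabra", "abcdr")
```

With these values, `descents(PI)` is a subset of `partial_sums(V)`, as stated.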

The following statement results from the preceding considerations. 

Theorem 1 For any positive vector $v = (n_1, n_2, \ldots, n_k)$ with
$n = n_1 + \cdots + n_k$, the map $w \mapsto \pi = P(w)$ is one to one from the
set of conjugacy classes of primitive words of length $n$ on $A$ with Parikh
vector $v$ onto the set of cyclic permutations on $\{1, 2, \ldots, n\}$ such
that $p(v)$ contains $\mathrm{des}(\pi)$.

This result is actually a particular case of a result stated in [3] and
essentially due to I. Gessel and C. Reutenauer [?]. The complete result
([?], Theorem 11.6.1 p. 378) establishes a bijection between words of type
$\lambda$ and pairs $(\pi, E)$ where $\pi$ is a permutation of type $\lambda$
and $E$ is a subset of $\{1, 2, \ldots, n - 1\}$ with at most $k - 1$ elements
containing $\mathrm{des}(\pi)$. The type of a word $w$ of length $n$ is the
partition of $n$ realized by the lengths of the factors of its nonincreasing
factorization in Lyndon words. The type of a permutation is the partition
formed by the lengths of its cycles. Thus, Theorem 1 corresponds to the case
where $w$ is a Lyndon word (i.e. $\lambda$ has only one part) and $\pi$ is
circular.

We illustrate the general case of an arbitrary word with an example for
the sake of clarity. The word $w = abaab$ has the nonincreasing factorization
in Lyndon words $w = (ab)(aab)$. Thus $w$ has type $(3, 2)$. The corresponding
permutation of type $(3, 2)$ is $\pi = (3\ 5)(1\ 2\ 4)$. Actually, the
permutation $\pi$ is obtained as follows. Its cycles correspond to the Lyndon
factors of $w$. The letters are replaced by the rank in the lexicographic
order of the cyclic iterates of the conjugates. In our example, we obtain

    1   a a b a a b
    2   a b a a b a
    3   a b a b a b
    4   b a a b a a
    5   b a b a b a

We have $\mathrm{des}(\pi) = \{3\}$ which is actually included in
$p(v) = \{3, 5\}$.
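The construction of this example can be sketched in Python. The sketch uses Duval's algorithm for the Lyndon factorization (a standard technique, swapped in here because the text does not say how the factorization is computed) and compares cyclic iterates on prefixes of length $2n$, which is enough to separate distinct infinite iterates. When a Lyndon factor is repeated, ties are broken by order of appearance; that tie-breaking rule is an assumption of this sketch and is not taken from the cited theorem.

```python
def lyndon_factorization(w: str) -> list[str]:
    """Duval's algorithm: nonincreasing factorization of w into Lyndon words."""
    factors, i, n = [], 0, len(w)
    while i < n:
        j, k = i + 1, i
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors


def cycles_of(w: str) -> list[tuple[int, ...]]:
    """One cycle per Lyndon factor; each letter is replaced by the rank of
    the cyclic iterate of the conjugate starting at that letter."""
    n = len(w)
    factors = lyndon_factorization(w)
    entries = []                                 # (key, factor index, offset)
    for f_idx, f in enumerate(factors):
        for off in range(len(f)):
            conj = f[off:] + f[:off]
            key = (conj * (2 * n // len(conj) + 1))[:2 * n]
            entries.append((key, f_idx, off))
    entries.sort(key=lambda e: e[0])             # stable: ties keep appearance order
    rank = {(f_idx, off): r for r, (_, f_idx, off) in enumerate(entries, start=1)}
    return [tuple(rank[(f_idx, off)] for off in range(len(f)))
            for f_idx, f in enumerate(factors)]


def one_line(cycles: list[tuple[int, ...]], n: int) -> list[int]:
    """Array form of the permutation given by its cycles (1-based values)."""
    pi = [0] * n
    for cyc in cycles:
        for pos, x in enumerate(cyc):
            pi[x - 1] = cyc[(pos + 1) % len(cyc)]
    return pi
```

For `abaab`, `cycles_of` yields the cycles $(3\ 5)$ and $(1\ 2\ 4)$, and the array form has its single descent at index 3.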

We may observe that when the alphabet is binary, i.e. when $k = 2$,
Theorem 1 takes a simpler form: the map $w \mapsto P(w)$ is one-to-one from
the set of primitive binary words of length $n$ onto the set of circular
permutations on $\{1, 2, \ldots, n\}$ having one descent.

In the general case of an arbitrary alphabet, another possible formulation is
the following. Let us say that a word $b_1 b_2 \cdots b_n$ is co-Lyndon if the
permutation $\pi$ built by Algorithm Permutation is an $n$-cycle. It is clear
that the map $w \mapsto T(w)$ is one-to-one from the set of Lyndon words of
length $n$ on $A$ onto the set of co-Lyndon words of length $n$ on $A$.

The properties of co-Lyndon words have never been studied and this might 
be an interesting direction of research. 

Example 2 The following array shows the correspondence between Lyndon
and co-Lyndon words of length 5 on $\{a, b\}$. The permutation $\pi$ is shown
on the right.
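Rows of this kind can be recomputed by brute force. The sketch below enumerates all words of a given length, keeps the Lyndon ones, and pairs each with its transform and the permutation $\pi$. It is a computed illustration with hypothetical function names, not a transcription of the original array, and it is only practical for small lengths.

```python
from itertools import product


def is_lyndon(w: str) -> bool:
    """w is Lyndon iff it is strictly smaller than all its proper rotations."""
    return all(w < w[i:] + w[:i] for i in range(1, len(w)))


def bwt(w: str) -> str:
    """Last column of the sorted array of conjugates of w."""
    return "".join(r[-1] for r in sorted(w[i:] + w[:i] for i in range(len(w))))


def permutation(y: str) -> list[int]:
    """pi in array form (1-based), via a stable sort of positions by letter."""
    return [j + 1 for j in sorted(range(len(y)), key=lambda j: y[j])]


def is_n_cycle(pi: list[int]) -> bool:
    """True iff the orbit of 1 under pi has length n."""
    j, steps = pi[0], 1
    while j != 1:
        j = pi[j - 1]
        steps += 1
    return steps == len(pi)


def lyndon_table(n: int, alphabet: str = "ab") -> list[tuple[str, str, list[int]]]:
    """Rows (Lyndon word, co-Lyndon word, pi) for all Lyndon words of length n."""
    rows = []
    for letters in product(alphabet, repeat=n):
        w = "".join(letters)
        if is_lyndon(w):
            y = bwt(w)
            rows.append((w, y, permutation(y)))
    return rows
```

Every row produced by `lyndon_table(5)` has a permutation that is a 5-cycle with exactly one descent, in agreement with the binary form of Theorem 1.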